ERANNs: Efficient residual audio neural networks for audio pattern recognition

Authors

Abstract

Audio pattern recognition (APR) is an important research topic and can be applied to several fields related to our lives. Therefore, accurate and efficient APR systems need to be developed, as they are useful in real applications. In this paper, we propose a new convolutional neural network (CNN) architecture and a method for improving the inference speed of CNN-based systems for APR tasks. Moreover, using the proposed method, we can improve the performance of our systems, as confirmed in experiments conducted on four audio datasets. In addition, we investigate the impact of data augmentation techniques and transfer learning on the performance of our systems. Our best system achieves a mean average precision (mAP) of 0.450 on the AudioSet dataset. Although this value is less than that of the state-of-the-art system, the proposed system is 7.1x faster and 9.7x smaller. On the ESC-50, UrbanSound8K, and RAVDESS datasets, we obtain results with accuracies of 0.961, 0.908, and 0.748, respectively. Our system for the ESC-50 dataset is 1.7x faster and 2.3x smaller than the previous system. For the RAVDESS dataset, our system is 3.3x smaller than the previous system. We name our systems "Efficient Residual Audio Neural Networks".

Similar resources

Audio Chord Recognition with Recurrent Neural Networks

In this paper, we present an audio chord recognition system based on a recurrent neural network. The audio features are obtained from a deep neural network optimized with a combination of chromagram targets and chord information, and aggregated over different time scales. Contrary to other existing approaches, our system incorporates acoustic and musicological models under a single training o...

Efficient Neural Audio Synthesis

Sequential models achieve state-of-the-art results in audio, visual and textual domains with respect to both estimating the data distribution and generating high-quality samples. Efficient sampling for this class of models has however remained an elusive problem. With a focus on text-to-speech synthesis, we describe a set of general techniques for reducing sampling time while maintaining high o...

Precision Scaling of Neural Networks for Efficient Audio Processing

While deep neural networks have shown powerful performance in many audio applications, their large computation and memory demand has been a challenge for real-time processing. In this paper, we study the impact of scaling the precision of neural networks on the performance of two common audio processing tasks, namely, voice-activity detection and single-channel speech enhancement. We determine ...

ANN Paradigms for Audio Pattern Recognition

Pattern recognition is the process of classifying data or patterns based on either a priori knowledge or statistical information extracted from the patterns. An audio pattern recognition problem is based on spoken speech patterns, which can be interpreted as speaker-dependent or speaker-independent. An Artificial Neural Network (ANN) is an information-processing machine learning model, inspired by bi...

Audio Visual Speech Recognition Using Deep Recurrent Neural Networks

In this work, we propose a training algorithm for an audiovisual automatic speech recognition (AV-ASR) system using a deep recurrent neural network (RNN). First, we train a deep RNN acoustic model with a Connectionist Temporal Classification (CTC) objective function. The frame labels obtained from the acoustic model are then used to perform a non-linear dimensionality reduction of the visual featu...

Journal

Journal title: Pattern Recognition Letters

Year: 2022

ISSN: 1872-7344, 0167-8655

DOI: https://doi.org/10.1016/j.patrec.2022.07.012